:arXiv:hep-lat/9809168vl " 24 Sep 1998| 



2 



An improvement to the linear accept /reject algorithm 

L. Lin a , K. F. Liu b and J. Sloan b 

a Department of Physics, National Chung Hsing University, Taichung 40227, Taiwan, ROC 
b Department of Physics and Astronomy, University of Kentucky, Lexington, KY 40506, USA 

We study the improvement of Kennedy-Kuti's linear accept/reject algorithm with different ordering criterion 
and modified Bhanot-Kennedy estimator of e AH to reduce probability-bound violation. A new stochastic Monte 
Carlo algorithm to accommodate the probability-bound violation is proposed. 



1. Introduction 

Sometime ago, Kennedy and Kuti Q proposed 
a Monte Carlo algorithm which admits stochas- 
tically estimated transition probabilities as long 
as they are unbiased. This opens up the door to 
solving problems where it is feasible to estimate 
the transition probabilities but intractable or im- 
practical to calculate them exactly. 

The acceptance probability (denoted as P a 
from now on) in Kennedy-Kuti's linear algorithm 
is 



P a (Ui - U 2 ) = A+ + A" 
if f{Ux) > f(U 2 ) , 

P a (Ui -> U 2 ) = \- + A H 
if f(Ui) < f(U 2 ) , 



(1) 



(2) 



where A^ are tunable real parameters ranging 
from to 1, r denotes an unbiased estimate of 
r and AH = H{U X ) - H{U 2 ). U x denotes the old 
configuration and U 2 the new (or proposed) con- 
figuration. f(U) is some observable of the gauge 
configuration U adopted for ordering between U\ 
and U 2 . Detailed balance can be proven to be 
satisfied. 

But there is a major problem with the linear 
algorithm. The value of P a can violate the prob- 
ability bound [0, 1] since it is estimated stochas- 
tically. Once the probability bound restriction is 
not respected, detailed balance is destroyed and 
systematic bias will show up. Hopefully, if the 
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bound violation occurs very rarely (e.g. once ev- 
ery one million times), then the systematic bias 
might be very small and the expectation values 
of various quantities can still be correct within 
statistical errors |l| . 

Within the framework of the linear algorithm, 
there are at least three ways to reduce the prob- 
ability of bound violations: 

(1) In general, the two tunable parameters A 
can be parameterized as 



A+ = 0.0 , A" = 



1 



1 



(3) 



where < e < oo. So e AH is allowed to fluctu- 
ate between and 1 + e. If we increase e, then 
the allowed range of e AH will be larger and the 
probability of bound violations can be reduced 
(although the intrinsic acceptance rate will be re- 
duced simultaneously). 

(2) One can choose a better ordering criterion 
to reduce the bound violation. One can imag- 
ine that when the ordering criterion is uncorre- 
lated with AH , then the upper bound will be vi- 
olated quite frequently (about 50% of the time for 
A~ = 1). Thus, we believe the "ideal" ordering 
criterion is AH itself. However, we cannot cal- 
culate AH exactly (otherwise we would not need 
the stochastic estimator); the best we can do is 
to estimate AH stochastically (without bias). We 
will obtain a number (denoted as xq) which can 
be made reasonably close to the true value of AH, 
and use it as the ordering criterion. This should 
greatly reduce the probability of upper-bound vi- 
olations. 
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(3) Most importantly, one can reduce the vari- 
ance of the estimated acceptance probability ei- 
ther by brute force or by constructing a bet- 
ter (unbiased) stochastic estimator. This will 
help reduce the probability of lower-bound viola- 
tions and further reduce upper-bound violations 
as well. 

Although one can improve the performance of 
the linear algorithm with these techniques, there 
are still problems inherent to the algorithm which 
are impossible to eradicate. First of all, if we as- 
sume that the estimator of the acceptance prob- 
ability has a Gaussian distribution (which should 
be a reasonable assumption), then no matter how 
hard we work, we will never completely exclude 
the bound violations since the two long tails of 
the Gaussian distribution always exist. Secondly, 
the linear algorithm with a stochastic estimator 
is a volume-squared algorithm. This can be eas- 
ily seen by the following consideration. The vari- 
ance of the estimated acceptance probability is 
roughly proportional to 5 2 /N where 5 2 is the 
intrinsic variance and N is the number of hits. 
(This proportionality relation is true no matter 
what unbiased estimator we choose.) The intrin- 
sic variance S 2 is proportional to the lattice size. 
Therefore, if we want to keep the bound viola- 
tions under very tight control, we need to work 
harder and harder (i.e. TV should become larger 
and larger) as the lattice size grows. So N should 
grow as the first power of the volume V , and the 
cost will be proportional to V 2 since the cost of 
the stochastic estimator is usually proportional 
to V. In real simulations on lattice QCD with dy- 
namical fermions, to lessen the probability bound 
violation in this way could be very costly. 

In order to completely remove any systematic 
bias coming from bound violations, and to reduce 
the cost of simulation on larger lattices, we need 
to go beyond the linear algorithm and come up 
with some new method. In the next section, we 
propose a new algorithm which will achieve these 
goals. We will see that the new algorithm elimi- 
nates the upper bound violation and absorbs the 
negative sign of the lower bound violation into the 
observables by introducing some auxiliary fields 
and going back to the Metropolis accept/reject. 



2. The New Algorithm 

The first step toward the new algorithm is to 
transform the "noise" (coming from the stochas- 
tic estimator) into "auxiliary fields", just as 
pseudofermion fields are introduced as auxiliary 
fields. Therefore, the configuration space is en- 
larged, and the updating and the accept/reject 
will be carried out in this enlarged space. To be 
more precise, the partition function Z is 

Z = j[DU}e- SG ^e- s "^ 

= J[DU][Dp]e- H ^e- s ^ 

= J [DU}[Dp][DZ}P £ (Oe- H V>rt f(U,Z),(4) 

where U represents the original field (not nec- 
essarily the link variable), p's are the conjugate 
momenta introduced in the Hybrid Monte Carlo 
(HMC) algorithm Q in case the molecular dy- 
namics is used for updating. H(U,p) = \ + 
Sg(U) is the hamiltonian and £'s are the auxil- 
iary fields "representing" the stochastic fields. P £ 
is the probability distribution for £'s. The func- 
tion f(U,£) is any function which satisfies 

e~ SF{U) = J[Dt]P&) f(U,0, (5) 

i.e. f(U,£) is an unbiased estimate of e~ SF ^ for 
noise £. 

This completes the first step toward the new 
algorithm. The space is indeed enlarged. Orig- 
inally, we had a configuration space of the IPs. 
HMC enlarges the space to (U,p). Now it is fur- 
ther enlarged to (U,p, £) with the inclusion of the 
stochastic field £. This first step actually sets up 
the platform for performing updates in the en- 
larged space. So from now on, when we specify 
a state, we are specifying a state in this enlarged 
space. 

One should also notice that everything is gen- 
eral so far. We have not written down a spe- 
cific functional form for /(£/,£) yet, nor have we 
decided what P^ should be. So this means the 
new algorithm can be applied to a general class 
of problems. It is not restricted to a specific type 
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of problem. The auxiliary fields can have arbi- 
trary distributions (e.g. Gaussian, Z 2 etc.) 

The second step of this new algorithm is to ob- 
serve that 

f(U,0 = sign(/) |/(£/,£) | . (6) 

Therefore sign(/), which means the sign of the 
function /, is a state function. We then can do 
the following: 



(A) 



[DU][D P m]P s (0 
A(U) S ign(f)e- H ^\f(U,0\, 



(J) 



where A is the quantity we want to measure. 
We now reinterpret |/|, which is positive defi- 
nite, as part of the probability distribution. So 
we are measuring the quantity A(U)sign(f) under 
the probability distribution e~ HtyU ' p ^ \f(U, 77, p)\ 
and the partition function Z = (sign(/)}. This 
in principle takes care of the lower probability- 
bound violations. (Actually they are is replaced 
with a potential sign problem.) Notice that this 
rcinterpretation is possible because the sign of 
f(U,r],p) is a state function. In the linear ac- 
cept/reject, one calculates the transition proba- 
bility directly (and stochastically). There, the 
transition probability is not a state function. 
When we obtain a minus sign there, it can not 
be swept into the observable as in Eq. (0). 

The third part of the new algorithm is to go 
back to Metropolis to do the accept /reject. There 
are two accept/reject steps. The first one is to 
propose updating of U and p in the usual way, 
e.g. molecular dynamics evolution while keeping 
the stochastic field £ fixed. The acceptance prob- 
ability P a is 



P a (U 1 , Pl ,^U 2 , P2 ,0 = 
e-"(^)|/([/ 2 ,0] 



V 



(8) 



The second accept/reject step refreshes the 
stochastic field £ according to the probability dis- 
tribution P^(0 while keeping U and p fixed. The 
acceptance probability in then 



1 



\f(u,&\ 



(9) 



Table 1 

Data of average energy obtained by using linear 
and new algorithms, e = in this case. 





Linear 


New 


1.0 


0.18009(14) 


0.17994(14) 


3.0 


0.18127(14) 


0.17999(14) 


6.0 


0.18216(14) 


0.18014(14) 


10.0 


0.18257(14) 


0.18005(14) 


20.0 


0.18342(14) 


0.17995(14) 



Obviously, there is no probability-bound violation 
in either of these Metropolis accept /reject steps. 

Detailed balance can be proven to be satisfied. 
It is obvious that all the techniques for reducing 
the variance of the estimator developed before can 
be applied here. 

We have tested the new algorithm on a very 
simple 5-state model. (This is the same model 
used in |Q for demonstration.) We calculated the 
average energy with the linear algorithm and the 
new algorithm. Some data are presented in table 
1. (The exact value is 0.180086.) 

We see that as the intrinsic variance of the es- 
timator grows, the linear algorithm introduces a 
systematic error (mostly due to upper-bound vi- 
olations). The new algorithm, however, still gives 
the correct value within errors. 

To apply this new algorithm to the dynamical 
fermion problem, we note that the fermion de- 
terminant can be calculated stochastically as a 
random walk process [Q 

T ,i n „ , , Tr In M, TrhxM , 
e TrlnM = l+TrlnM(l+ (1+ (..., 

which can in turn be written in the following in- 
tegral 



Tr In M 



= [ f[V m S(\ m \-l) f f[Vp v 

J i=l J n=l 



[1 + n{ In Af 771(1 + 9(p 2 - 1/2)4 In A/772 
(l+0(p3-2/3)r?!lnM77 3 ( ] 

This sequence terminates stochastically in finite 
time and only the seeds from the pseudo-random 
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number generator need to be stored in practice. 
Comparing this to Eq. (||), the function /(£/, ry, p) 
(the £ in Eq. (||) is represented by two stochastic 
fields r\ and p here) can be defined for the fermion 
determinant. One can then use the efficient Pade- 
Z 2 algorithm || to calculate /({/, 7y,p). 

While we are in the process of applying this 
algorithm to QCD, we have considered how effi- 
ciently the HMC with pseudofermion is as far as 
updating TrlnM is concerned. To this end, we 
have run the HMC with pseudofermions on an 
8 3 x 18 lattice at = 5.6 and k = 0.154. 40 steps 
are taken with step size of 0.025 in 170 molecu- 
lar dynamics trajectories. We calculated the cor- 
relation between the change of TrlnM and the 
change of the pseudofermion action Sf along a 
trajectory. The latter is designed to emulate the 
former. We find the normalized correlation is 

(ATr In M AS F ) = 0.053 ± 0.133, (10) 

which is quite small. This raises a question as 
to how efficient the HMC with pseudofermions 
might be in updating the fermion determinant. 
Further study of the autocorrelation of Tr In M 
is needed to answer this question. 

3. Summary and Discussion 

In summary, the new algorithm solves the prob- 
lem of probability-bound violations which has 
been troubling the linear accept/reject algorithm. 
The upper-bound violation problem is solved by 
going back to the Metropolis accept/reject (in 
the enlarged space). The lower-bound violation 
problem is solved by grouping the sign with the 
observable. With the probability bound viola- 
tion problem solved, we do not have to have an 
extremely small variance in the stochastic esti- 
mation. So, in principle, the volume-squared be- 
haviour of the algorithm is less sever and hope- 
fully can be put under control. 
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